[Analysis] Clamp SelectOp divisibility when condConstancy reduces output contiguity #10359

Merged
lezcano merged 4 commits into triton-lang:main from Manas103:fix/10067-select-divisibility
May 27, 2026
Conversation

@Manas103
Contributor

@Manas103 Manas103 commented May 22, 2026

Fixes #10067.

In SelectOpAxisInfoVisitor's tensor-cond branch, the call to getDivisibilityFromContiguity only sees the lhs/rhs contiguities and can overestimate divisibility when condConstancy further reduces the output contiguity below either input's contiguity.

Concrete example

  • lhs: [8, 9, 10, 11, 12, 13, 14, 15] (c=8, d=8)
  • rhs: [16, 17, 18, 19, 20, 21, 22, 23] (c=8, d=16)
  • Element-wise condition, so condConstancy = 1

Output contiguity collapses to gcd(8, 8, 1) = 1 (every position is a leader). But getDivisibilityFromContiguity sees c_lhs == c_rhs == 8 and returns gcd(8, 16) = 8 — without accounting for condConstancy. The output value at position 1 may be 17, which is not divisible by 8.
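The overestimate in this example can be checked by hand. The following is a minimal sketch, not Triton code: `selectElementwise`, `gcdOfAll`, and `exampleOutput` are hypothetical helpers, and the alternating condition is just one assumed instance of an element-wise condition with condConstancy = 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Triton): out[i] = cond[i] ? lhs[i] : rhs[i].
std::vector<int64_t> selectElementwise(const std::vector<bool> &cond,
                                       const std::vector<int64_t> &lhs,
                                       const std::vector<int64_t> &rhs) {
  assert(cond.size() == lhs.size() && lhs.size() == rhs.size());
  std::vector<int64_t> out(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i)
    out[i] = cond[i] ? lhs[i] : rhs[i];
  return out;
}

// When output contiguity is 1, every element is a leader, so a sound
// divisibility must divide every element. This returns the GCD of all values.
int64_t gcdOfAll(const std::vector<int64_t> &values) {
  int64_t g = 0;
  for (int64_t v : values)
    g = std::gcd(g, v);
  return g;
}

// The lhs/rhs values from the example, with an assumed alternating condition.
std::vector<int64_t> exampleOutput() {
  std::vector<int64_t> lhs{8, 9, 10, 11, 12, 13, 14, 15};
  std::vector<int64_t> rhs{16, 17, 18, 19, 20, 21, 22, 23};
  std::vector<bool> cond{true, false, true, false, true, false, true, false};
  return selectElementwise(cond, lhs, rhs);
}
```

With this condition the output is [8, 17, 10, 19, 12, 21, 14, 23], so the only divisibility that holds for every leader is 1, not the 8 reported by the helper.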

Why this didn't blow up

On the current pow2 lattice, gcd == min on powers of 2. Codegen's vec_width = min(c, d/e, ...) is bounded by contiguity, and contiguity is computed correctly. So the overestimated divisibility is never the binding constraint on vec_width. This is a latent soundness regression introduced by #7781 — the pre-#7781 code clamped divisibility against the just-computed output contiguity, and that was sound independent of the pow2 invariant.
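To illustrate why the overestimate stays latent in the example, here is a rough model of the bound quoted above. `vecWidthBound` is a hypothetical function, not Triton's codegen, and it assumes `e` is a positive divisor applied to divisibility (for instance an element size) and that the elided terms in `min(c, d/e, ...)` do not matter here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of vec_width = min(c, d / e); the "..." terms in the
// description are omitted. Not Triton's actual code.
int64_t vecWidthBound(int64_t contiguity, int64_t divisibility, int64_t e) {
  assert(contiguity >= 1 && divisibility >= 1 && e >= 1);
  return std::min(contiguity, divisibility / e);
}
```

With the correctly computed output contiguity of 1, `vecWidthBound(1, 8, 1)` and `vecWidthBound(1, 1, 1)` both evaluate to 1, so in this example the overestimated divisibility of 8 does not change the result.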

Fix

A conditional GCD with the output contiguity at the SelectOp callsite. The clamp fires only when condConstancy (or another shrinking factor) reduces the output contiguity strictly below at least one input's contiguity — i.e. it preserves the existing semantics when condConstancy is non-binding.

The helper getDivisibilityFromContiguity is left unchanged (its other callers — MaxMinOpAxisInfoVisitor and AxisInfo::join — don't have a condConstancy-equivalent, so the same gap doesn't exist there).
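The conditional clamp described in this section can be sketched in isolation. This is an illustration under stated assumptions rather than the code in lib/Analysis/AxisInfo.cpp: `outputContiguity` and `clampDivisibility` are hypothetical names, `helperDiv` stands in for whatever getDivisibilityFromContiguity returns, and the output contiguity follows the gcd formula used in the example above.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>

// Hypothetical: output contiguity as gcd(c_lhs, c_rhs, condConstancy).
int64_t outputContiguity(int64_t lhsContig, int64_t rhsContig,
                         int64_t condConstancy) {
  assert(lhsContig >= 1 && rhsContig >= 1 && condConstancy >= 1);
  return std::gcd(std::gcd(lhsContig, rhsContig), condConstancy);
}

// Hypothetical: clamp the helper's divisibility only when the output
// contiguity is strictly below at least one input's contiguity.
int64_t clampDivisibility(int64_t helperDiv, int64_t lhsContig,
                          int64_t rhsContig, int64_t outContig) {
  if (outContig < lhsContig || outContig < rhsContig)
    return std::gcd(helperDiv, outContig);
  return helperDiv; // condConstancy is non-binding: keep existing semantics
}
```

For the example (c_lhs = c_rhs = 8, condConstancy = 1, helper result 8), the output contiguity is 1 and the clamped divisibility is gcd(8, 1) = 1. When the output contiguity equals both input contiguities, the helper's value passes through unchanged.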

Test

Added select_cond_constancy_clamps_divisibility in test/Analysis/test-alignment.mlir. The test fails before the fix (divisibility = [8]) and passes after (divisibility = [1]).

New contributor declaration

  • I am not making a trivial change, such as fixing a typo in a comment.

  • I have written a PR description following these rules.

  • I have run pre-commit run --from-ref origin/main --to-ref HEAD.

  • Select one of the following.

    • I have added tests.
      • /test for lit tests
      • /unittest for C++ tests
      • /python/test for end-to-end tests
    • This PR does not need a test because FILL THIS IN.
  • Select one of the following.

    • I have not added any lit tests.
    • The lit tests I have added follow these best practices,
      including the "tests should be minimal" section. (Usually running Python code
      and using the instructions it generates is not minimal.)

…put contiguity

In SelectOpAxisInfoVisitor's tensor-cond branch, the call to
getDivisibilityFromContiguity sees only the lhs/rhs contiguities and can
overestimate divisibility when condConstancy further reduces the output
contiguity below either input's contiguity.

Example: lhs c=8 d=8, rhs c=8 d=16, condConstancy=1. Output contiguity
collapses to 1 (every position is a leader), but the helper returns
gcd(8, 16) = 8 because c_lhs == c_rhs. The output value at position 1
may be 17, not divisible by 8.

This is latent on the current pow2 lattice (gcd == min, and codegen
vec_width is capped by contiguity, which is computed correctly), but it
is a soundness regression introduced by triton-lang#7781.

Fix is a conditional GCD with the output contiguity at the SelectOp
callsite, preserving the existing semantics when condConstancy does not
bind.

Fixes triton-lang#10067.
@Manas103 Manas103 marked this pull request as ready for review May 22, 2026 17:53
@Manas103 Manas103 requested a review from ptillet as a code owner May 22, 2026 17:53
Comment thread lib/Analysis/AxisInfo.cpp Outdated
// divisible only by gcd(d_src, p) <= gcd(d_src, outContig). Clamp
// divisibility by output contiguity to keep this sound.
// getDivisibilityFromContiguity itself does not see condConstancy.
int64_t div = getDivisibilityFromContiguity(lhsInfo, rhsInfo, d);
Contributor

simplify this as the following?

int64_t div =
    gcd(getDivisibilityFromContiguity(lhsInfo, rhsInfo, d),
        contiguity.back());

Contributor Author

Done in 2fb17bd. Lit suite + pre-commit clean.

@lezcano
Contributor

lezcano commented May 26, 2026

Are the failures related?

FAILED python/triton_kernels/tests/test_matmul.py::test_op[None-True-True-False-False-pad_a-128-64-128-96-ragged-mxfloat8_e4m3fn-bfloat16-bfloat16-10-1-True-False-False-False-None-False-False-False-True-None] - RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x1024 and 87x128)
FAILED python/triton_kernels/tests/test_matmul.py::test_op[None-True-True-False-False-pad_a-128-64-128-96-ragged-mxfloat8_e4m3fn-float16-bfloat16-10-1-True-False-False-False-None-False-False-False-True-None] - AssertionError

@Jokeren
Contributor

Jokeren commented May 26, 2026

Are the failures related?

FAILED python/triton_kernels/tests/test_matmul.py::test_op[None-True-True-False-False-pad_a-128-64-128-96-ragged-mxfloat8_e4m3fn-bfloat16-bfloat16-10-1-True-False-False-False-None-False-False-False-True-None] - RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x1024 and 87x128)
FAILED python/triton_kernels/tests/test_matmul.py::test_op[None-True-True-False-False-pad_a-128-64-128-96-ragged-mxfloat8_e4m3fn-float16-bfloat16-10-1-True-False-False-False-None-False-False-False-True-None] - AssertionError

No, I debugged and there's an IMA in triton_kernels. Trying to isolate the commit that caused this problem now

@lezcano lezcano enabled auto-merge (squash) May 26, 2026 12:15
@lezcano lezcano merged commit fb2ee67 into triton-lang:main May 27, 2026
19 of 20 checks passed
Development

Successfully merging this pull request may close these issues.

SelectOp AxisInfo: getDivisibilityFromContiguity ignores condConstancy after #7781
